PERFORMANCE vs. PAPER-AND-PENCIL ESTIMATES
OF COGNITIVE ABILITIES 



James K. Arima 



Naval Postgraduate School 
Monterey, California 93940 



INTRODUCTION 

The use of traditional, psychometrically created, paper-and-pencil tests
for selection has come under considerable criticism in recent times. One domi- 
nant source of this critical appraisal is equal employment opportunity legisla- 
tion and the court decisions that have followed. The tests have been criticized 
for their cultural bias, and even when they have been shown to be equally valid 
for various ethnic or socioeconomic groups in the job context, their continued 
use has been decried on the basis of the adverse impact that results. Another 
source of criticism has been politically motivated actions capitalizing on the 
distrust and dislike of objective tests by a segment of the general public. 

This activity has resulted in the banning of mass testing for pupil classifica- 
tion in California and the so-called "truth in testing" legislation passed in 
New York (Smith, 1979). Finally, questioning of the construct validity and 
ecological relevance of factorially developed tests has come from the lack of 
intersection between test constructs and findings in the rapidly developing area 
of cognitive psychology (Carroll & Maxwell, 1979; Sternberg, 1979). This last 
basis for criticism is particularly important to the psychological profession 
as it points out the distinction made years ago by Cronbach (1957) between the two
disciplines of scientific psychology: the correlational and the experimental
approaches.

Taking cognizance of these trends, an earlier effort attempted to create a 
performance test that was practical to administer, had high construct validity, 
was culture free, and would provide results that could be broadly generalized 
(Arima, 1978; Young, 1975). In addition, an important consideration in creating 
the test was to measure an ability that was not being sufficiently assessed by 
conventional testing procedures and that would simultaneously provide a new 
dimension for making selection decisions. Accomplishing this could increase the 
selection pool and provide opportunities for individuals who might have been 
eliminated by conventional procedures. The new dimension was learning aptitude, 
defined as the ability to profit from experience. Broadly defined in this manner,
learning ability has been proposed as an important indicator of intelligence,
with higher levels of intelligence demonstrated by the ability to learn a fixed
amount of material in a shorter time or a larger amount of material in a fixed
period of time (Estes, 1974). Learning ability, manifested by such
measures as grade point average, has been frequently used as a dependent variable 
in traditional test research, but the format and procedures of paper-and-pencil 
tests have made it impractical to use learning as an independent or selection 
variable. On the other hand, simple learning tasks have been extensively used 
and validated in comparative psychology (Bitterman, 1975). Validation in this
context has been the demonstration of reliably different levels of performance 
in humans by age or in animals by the phylogenetic hierarchy (Jensen, 1979).

The test, itself, was a discrimination-learning task in which pairs of 
random forms were presented sequentially to subjects. One member of a pair was 
arbitrarily designated as the correct alternative, which the subject learned to 
identify on the basis of positive reinforcement whenever a correct choice was 
made. Six different pairs made up a list, and their presentation, a trial. In 
all, 10 trials were given with the item pairs appearing in different random orders 
in each of the trials. The test was administered in a machine-paced and a self- 
paced mode to Navy recruits undergoing basic training. 

Significant amounts of learning took place over the 10 trials, and the corre- 
lation between odd and even trials showed a reliability of .838 when corrected 
for a test of full length using the Spearman-Brown formula. There was a low, 
but significant correlation (r = .27, N = 137) between the discrimination-learn- 
ing test scores and the Armed Forces Qualification Test (AFQT) scores attained 
by the subjects in their entrance testing. When the total group was split into 
white and nonwhite subjects, only the correlation for the white subjects 
(r = .223, N = 104) reached statistical significance at the .05 level. Thus, 
it appeared that the performance measure might be giving an assessment of the 
true capability of the nonwhite subjects which the verbal AFQT score failed to 
accomplish. Since, however, the correlation was .213 for the 33 nonwhite sub- 
jects, its lack of significance might have been due to smaller sample size. 
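The two calculations behind this paragraph can be sketched in a few lines. This is only an illustration, not the analysis run for the original study: the Spearman-Brown step is the standard two-halves formula, while the significance check uses the Fisher z approximation as a stand-in, since the text does not say which test was applied. The function names are hypothetical.

```python
import math


def spearman_brown(r_half: float) -> float:
    """Step up a split-half correlation to a test of full length."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_from_full(r_full: float) -> float:
    """Invert Spearman-Brown: the odd-even correlation implied by r_full."""
    return r_full / (2.0 - r_full)


def fisher_z_p_value(r: float, n: int) -> float:
    """Approximate two-tailed p for H0: rho = 0 (Fisher z; an assumption here)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-tailed tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


r_half = split_half_from_full(0.838)      # odd-even r implied by the reported .838
p_white = fisher_z_p_value(0.223, 104)    # white subsample
p_nonwhite = fisher_z_p_value(0.213, 33)  # nonwhite subsample
print(f"implied odd-even r = {r_half:.3f}")
print(f"white: p = {p_white:.3f}; nonwhite: p = {p_nonwhite:.3f}")
```

Under this approximation, two nearly equal correlations fall on opposite sides of the .05 criterion because of sample size alone, which is the caution raised in the text.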

There was also a significant difference on the learning test between white and 
nonwhite subjects using the machine-paced mode, but not in the self-paced mode. 
However, the interaction term of ethnic grouping and presentation mode had a 
probability between .10 and .20 in the analysis of variance of learning test 
scores, so the differential effects of presentation mode for the racial groupings
were not fully confirmed.

The present effort was a continuation of the original project and was motivated
by several considerations. First, the learning test was reconfigured to make it
more portable and simple to administer. It was made into a self-paced mode using 
a correction procedure so that selection of only the correct alternative automat- 
ically advanced the test to the next pair of items. These changes required a 
tryout and comparison of the results with the previous findings. There was a 
desire to see if the lack of a difference in performance between whites and non- 
whites would hold up in the self-paced mode using the reconfigured test. There 
was also a severe restriction in range in the earlier study because the subjects 
had been selected for service using the AFQT score as a screen. An unselected 
group was desired for whom the scores of the entire selection battery would be 
available for comparison with the discrimination-learning test score. 



METHOD 



Test Modifications 



The test, as developed for the original study (Arima, 1978), had three 
stimulus "lists" that were presented to individuals and scored by means of a
set of "off the shelf" laboratory equipment. It was basically a machine-paced
test, and the subject had to press a button to advance the stimuli in the self- 
paced mode. The equipment was cumbersome and large and required considerable 
effort to set up. The objectives of the test modifications were to make it 
simple and portable and to run automatically in a self-paced mode. 

Since there was no great effect for similarity of stimuli within or 
between the lists in the original study, stimulus list 1 from the original 
study was selected. This list (Fig. 1) was constructed to have the least amount 
of similarity between the stimuli in each pair and among the pairs of the list. 

One member of each pair was randomly designated as the correct choice. 

The basic equipment for the reconfigured test was an SR-400 Stimulus- 
Response (S-R) Programmer made by Behavioral Controls, Inc. (BCI). The SR-400
has four clear-plastic panels that can be used to present visual stimuli and 
also serve as the response keys. Stimuli are presented by means of a fan- 
folded continuous strip of paper that can be programmed to control each of the 
four channels. It is essentially a sophisticated "teaching machine." In this 
application, only the two central panels were used, and the other two were 
blacked out and deactivated. 

As previously, 10 different versions of the stimulus list were made in 
which the order of the pairs was different, and each member of a stimulus pair 
randomly occupied the right or left position an equal number of times over all 
10 versions. The 10 lists were connected into one continuous sequence with the 
restriction that any one pair did not appear back-to-back. The lists were 
physically created by pasting the appropriate random figures to the designated 
position (right or left) on a sheet of the continuous, fan-folded paper. Each 
pair was coded for the correct response by punching the appropriate channel of 
the control segment of the sheet. This was done for the 60 stimulus pairs that 
constituted the entire, 10-list sequence. 
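The construction rules in this paragraph can be expressed as a short sketch. It is a hypothetical reconstruction, not the procedure actually used to paste up the paper strip: the function name, the seed, and the reshuffle-until-valid strategy are assumptions. Each frame records the pair and the side on which its correct member appears, which corresponds to the channel punched in the control segment.

```python
import random

N_PAIRS = 6    # stimulus pairs per list (from the text)
N_LISTS = 10   # versions of the list, one per trial (from the text)


def build_sequence(seed=0):
    """Return 60 frames as (pair_id, side_of_correct_member) tuples."""
    rng = random.Random(seed)

    # For each pair, the correct member sits on the left in exactly half of
    # the lists and on the right in the other half, in random order, so each
    # member occupies each position equally often over the 10 versions.
    sides = {}
    for pair in range(N_PAIRS):
        assignment = ["L"] * (N_LISTS // 2) + ["R"] * (N_LISTS // 2)
        rng.shuffle(assignment)
        sides[pair] = assignment

    frames = []
    for list_no in range(N_LISTS):
        order = list(range(N_PAIRS))
        rng.shuffle(order)
        # Reshuffle until the first pair differs from the last pair of the
        # preceding list, so that no pair appears back-to-back.
        while frames and order[0] == frames[-1][0]:
            rng.shuffle(order)
        frames.extend((pair, sides[pair][list_no]) for pair in order)
    return frames


frames = build_sequence()
print(len(frames), frames[:N_PAIRS])
```

Drawing a fresh seed immediately before each session would give every subject a different sequence that still satisfies the same constraints.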

In operation, the SR-400 was programmed to advance to the next stimulus 
pair when the correct panel (stimulus) had been pressed. Thus, a correction 
method was used for the reinforcement — i.e., the subject had to make a correct 
response before the paper would move. A BCI counter incorporated into the setup
through a BCI Four-Choice Auxiliary Control Console cumulated correct and incor- 
rect responses, and a timer mounted on the control console cumulated viewing 
time. (It did not move during the time the programmer was cycling to a new 
pair.) A stepping counter was built into the rear of the counter to buzz when 
six consecutive correct responses were made, but it became unreliable and was 
not used in test runs. The cycle time between stimulus pairs was 1.4 sec., and 
the equipment was programmed to stop at the end of the 10-list sequence. 

Figure 1. Test Figures

Subjects

Subjects were obtained through three high schools in Monterey County, California,
that participated in the high school testing program of the Defense
Department. In this program, the Armed Services Vocational Aptitude Battery
(ASVAB) is administered as a service without cost to high schools for vocational
counseling. The results of the testing go initially to the high school counselor,
but copies also go to recruiters in the area of the participating schools.
Utilizing this source of subjects made it possible to compare learning perfor- 
mance with psychometric test measures in a relatively unselected population, 
which was one of the purposes of this study. The 65 students with ASVAB scores 
who were made available for this effort were divided by sex and ethnic grouping 
as shown in Table 1. The nonwhites were Hispanic (11), black (1), Filipino (2), 
Oriental (4), Native American (1), and other (3). The subjects came from grades 
9 through 12 with the average being 10.7. They ranged in age from 14 through 18 
with an average age of 16.2 years. 



Table 1
Subjects

Ethnic Group     Male    Female    Total

White             17       26        43
Nonwhite          11       11        22
(Total)           28       37        65



ASVAB 



The ASVAB used in the high school testing program was the version identified 
as Form 5. The tests of the battery, along with their length and reliability, 
are shown in Table 2. The General Information test includes items of common 
knowledge that individuals could pick up casually. It was included to provide 
a measure of the ability of subjects who do not do well in the remainder of the 
battery, especially those coming from socially deprived environments. Attention 
to Detail (AD), a perceptual speed test, and Numerical Operations are designed 
to evaluate potential clerical workers. The Electronic (EI), Shop (SI), and
Automotive Information (AI) tests are trade-type tests to identify individuals who
already have some capability in these areas or whose familiarity with the material
serves as an indication of their interest in this type of work. The other tests
are assessments of cognitive skills and stored knowledge. The Armed Forces
Qualification Test (AFQT) score is a linear combination of the Word Knowledge (WK),
Arithmetic Reasoning (AR), and Space Perception (SP) tests normed on the World
War II mobilization population. It has a reliability of .93 (Jensen et al., 1977).
The utilization of the ASVAB in high schools for counseling has been criticized 
by Cronbach (1979) because it is essentially a selection and placement test as 
used by the Armed Forces. The Armed Forces Vocational Testing Group has attempted 
to create composites and provide norms using the relevant population to make it 
more acceptable for counseling in the high schools while still retaining its pri- 
mary purpose for the military (U.S. Military Enlisted Processing Command, undated). 

Procedure 



The test equipment, now quite portable, was set up in the schools where the 
subjects were available for testing. The instructions were provided to small 
groups of four or fewer, but subjects were run in private. The mechanics of the
test were explained in the instructions, along with advice that the test was
being used for research purposes only and that it was not a timed test but the
subject should work quickly without rushing. After the task had been described,
the subjects were shown a two-item test, not using the figures in the record
test, to demonstrate how the test would be run and to acquaint the subject with
nonsense figures. The subjects were then run individually. Once the first stimulus
was presented, the test ran continuously with no apparent break until the
60th frame had been processed.



Table 2

Subtests of the ASVAB Form 5

                                   Number of       Subtest
       Name of Test                  Items      Reliabilities*

(GI)   General Information            15             .67
(NO)   Numerical Operations           50             .88
(AD)   Attention to Detail            30             .82
(WK)   Word Knowledge                 30             .91
(AR)   Arithmetic Reasoning           20             .82
(SP)   Space Perception               20             .82
(MK)   Mathematical Knowledge         20             .88
(EI)   Electronic Information         30             .87
(MC)   Mechanical Comprehension       20             .81
(GS)   General Science                20             .77
(SI)   Shop Information               20             .83
(AI)   Automotive Information         20             .84

* The data are from Jensen et al. (1977). The reliabilities were
derived using Kuder-Richardson Formula 20 with the exception of
Numerical Operations and Attention to Detail, which were obtained
by test-retest methods using ASVAB Form 6.



RESULTS 



The total exposure time of the stimuli ranged from 35.5 to 161.1 sec. with 
a mean exposure time of 79.1 sec. Incorporating the 1.4-sec. cycle time between 
stimuli, the individual administration of the test required an average of 2.7 
min. Since all subjects were administered 60 stimulus pairs, those with the 
shortest exposure times were averaging a little over .5 sec. per frame. Speed 
on the test could be a characteristic of quick learning or a rapid response set. 
The latter might be the result of negative motivational factors induced by telling 
the subjects that the test was being given for strictly research purposes.
The role of rate of responding was of considerable concern,
since the scoring of the test was in terms of the number of correct responses
per unit of viewing time. This was called the Information Processing Rate (IPR),
since each stimulus pair carried one bit of information. The correlation between
the number correct and viewing time was -.73, which was significant at the .01
level. This indicated that the individuals who learned more required less time.
Accordingly, it was concluded that subjects were motivated to perform well and
that quick responding was, as originally hypothesized, an indication of rapid
learning.
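The timing and scoring arithmetic in this paragraph can be checked with a small sketch. Several details are assumptions rather than statements from the text: that one 1.4-sec. cycle is charged per frame, that the "unit of viewing time" is the second, and that the score is displayed after multiplication by 1,000. The 48-correct subject is purely hypothetical.

```python
CYCLE_TIME_SEC = 1.4   # programmer cycle time between pairs (from the text)
N_FRAMES = 60          # stimulus pairs administered to every subject


def administration_minutes(viewing_time_sec: float) -> float:
    """Session length: viewing time plus one cycle per frame (assumed)."""
    return (viewing_time_sec + N_FRAMES * CYCLE_TIME_SEC) / 60.0


def ipr(n_correct: int, viewing_time_sec: float, scale: float = 1000.0) -> float:
    """Correct responses per second of viewing time, scaled for display."""
    return scale * n_correct / viewing_time_sec


print(round(administration_minutes(79.1), 1))  # mean viewing time of 79.1 sec.
print(round(35.5 / N_FRAMES, 2))               # fastest subject, sec. per frame
print(round(ipr(48, 79.1)))                    # hypothetical subject, 48 correct
```

With the mean viewing time of 79.1 sec., the first call gives the 2.7 min. reported for an average administration.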

The means and standard deviations on all subtests of the ASVAB, the AFQT
composite, and the IPR are shown in Table 3 by sex, ethnic group, and the entire 
sample. The IPR was multiplied by 1,000 for convenience in displaying the ratio. 

At the .05 significance level, there were no male-female differences in IPR 
scores for the total sample or the subsamples. There were significant differences 
between all whites and nonwhites (t = 2.20) and between white and nonwhite
females (t = 2.30). The difference of 72.42 in the mean scores of white and
nonwhite males did not reach statistical significance. Thus, it appears that 
there are white-nonwhite differences in IPR performance, and that this dif- 
ference was due primarily to differences between females of the two groups. 

On the AFQT, there was a significant difference in mean scores between males 
and females at the .05 level for only the white subjects (t = 2.26). No dif- 
ferences were found between all males and females and between nonwhite males and 
females. There were significant white-nonwhite differences in mean AFQT scores 
for all categories of subjects. The white-nonwhite difference for all subjects 
was significant at the .01 level (t = 3.00). The differences between white and 
nonwhite males (t = 2.44) and between white and nonwhite females (t = 2.10) were
significant at the .05 level. To summarize, there are consistent differences 
between all white and nonwhite groupings on the AFQT dimension. The only sex- 
related difference occurs between male and female whites. 

Because of the differences in the sizes of the subsamples, the t-test was
used to assess the differences for each contrast rather than an analysis of 
variance incorporating all of the variables simultaneously. In the significant 
differences that were found, the higher mean was always for whites or males. 
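A contrast of this kind can be recomputed from the summary statistics in Table 3. The sketch below assumes the pooled-variance form of the independent-samples t-test; the text does not say which variant was used, so agreement with the reported values is approximate. The function name is hypothetical.

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Independent-samples t from means, SDs, and Ns (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (m1 - m2) / se, df


# IPR means and standard deviations for white and nonwhite females (Table 3).
t, df = pooled_t(648.00, 265.64, 26, 450.00, 152.52, 11)
print(f"t({df}) = {t:.2f}")
```

Under the pooled-variance assumption, the female contrast comes out close to the reported t of 2.30.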

The correlations of the IPR score with the ASVAB tests and the AFQT composite
are shown in Table 4 for the total sample and by sex and ethnic groups. The
most noteworthy correlations in Table 4 are those between IPR and General
Information (GI) for the total sample and for nonwhites at a significance level of
.01 and for females at a significance level of .05. The correlation of IPR with
Mechanical Comprehension (MC) followed a similar pattern, except that the
correlation was not as high, and for females, the correlation of .31 was
significant at only the .06 level. There was also a low but significant correlation
of IPR with AFQT for the total sample and females. There is a complete absence
of significant correlation between IPR and the psychometric test variables for
whites and males. In the case of the former, General Information (GI) and
Automotive Information (AI) are the highest correlations, while General Information and Mechanical
TABLE 3

MEAN TEST SCORES BY RACE AND SEX - ALL SCHOOLS

              WHITE                  NONWHITE                 TOTAL
         male  female   total    male  female   total    male  female   total

N          17      26      43      11      11      22      28      37      65

GI      10.29    7.85    8.81   10.09    6.46    8.27   10.21    7.43    8.63
         1.69    1.83    2.13    2.95    2.12    3.12    2.22    1.99    2.50

WK      21.88   17.89   19.46   17.36   13.46   15.41   20.11   16.57   18.09
         5.08    5.60    5.69    7.49    6.79    7.26    6.41    6.23    6.50

MK      14.47   13.15   13.67   10.64   11.73   11.18   12.96   12.73   12.83
         4.19    4.32    4.27    4.43    5.41    4.86    4.62    4.64    4.59

GS      12.06    8.92   10.16    8.91    6.73    7.82   10.82    8.27    9.37
         4.01    3.03    3.74    2.81    3.04    3.07    3.86    3.16    3.68

NO      36.53   36.50   36.51   31.91   36.09   34.00   34.71   36.38   35.66
         7.98    8.05    7.92    9.75   11.40   10.57    8.84    9.00    8.90

AR      13.47   11.96   12.56   10.64   10.00   10.32   12.36   11.38   11.80
         4.24    3.18    3.67    4.23    3.19    3.67    4.39    3.27    3.79

EI      17.24   12.31   14.26   14.64   12.82   13.73   16.21   12.46   14.08
         5.87    4.10    5.09    4.52    2.82    3.80    5.45    3.73    4.88

SI      12.88    8.92   10.49   11.64    6.91    9.27   12.39    8.32   10.08
         3.77    2.56    3.63    3.78    2.17    3.86    3.76    2.59    3.72

AD      13.71   14.73   14.33   14.64   13.09   13.86   14.07   14.24   14.17
         3.87    3.09    3.41    3.03    4.78    4.10    3.63    3.69    3.63

SP      12.41   10.58   11.30    8.46    9.00    8.73   10.36   10.11   10.43
         5.41    3.69    4.48    3.39    4.38    3.83    5.05    3.91    4.42

MC      11.94    8.35    9.77    9.82    5.82    7.82   11.11    7.60    9.11
         3.60    3.05    3.69    2.68    1.17    2.87    3.38    2.86    3.54

AI       9.35    7.15    8.02    3.91    5.82    7.36    9.18    6.76    7.80
         4.89    3.03    3.97    2.91    2.27    3.00    4.16    2.86    3.66

AFQT    47.77   40.42   43.33   36.46   32.46   34.46   43.32   38.05   40.32
        11.36    9.73   10.89   12.45   12.36   12.28   12.87   11.03   12.05

IPR    659.06  648.00  652.37  586.64  450.00  513.32  630.60  589.14  607.00
       248.80  265.64  256.15  176.39  152.52  175.69  222.64  252.75  239.32

Note. Top number is test mean. Bottom number is test standard deviation.
The table is from Sherman (1979).

Table 4

Correlations of IPR with ASVAB Tests and AFQT Composite

Comprehension (MC) are the highest for the males. Thus, the nonwhites and females
appear to be the prime contributors to any obtained relationship between the IPR 
scores and the psychometric test variables. 

In order to obtain an indication of the relationship to IPR of all of the 
variables in the study considered simultaneously, the IPR scores were regressed 
in a stepwise manner on the study variables using the SPSS program (Nie et al.,
1975). The independent variables included the ASVAB tests, the AFQT composite,
two dummy variables for the three high schools, a dummy variable for ethnic group,
and a dummy variable for sex. Interactive variables were created by multiplying
the General Information and AFQT scores by each of the dummy variables. The
stepwise procedure was stopped when the adjusted r² did not improve and the
significance of the overall F ratio for regression failed to improve. The fitted
equation is shown in Table 5. The obtained r² was .239 (adjusted r² = .189).

Table 5

Stepwise Regression of IPR on the Study Variables

Variables in Equation*        B      Beta    Std Error B      F      Sig

GI                          32.22     .34       11.83        7.42    .01
AFD3                         3.48     .33        1.28        7.33    .01
GID1                       -12.49    -.21        6.91        3.27    N.S.
EI                          -8.67    -.18        6.16        1.98    N.S.

constant                   384.96

* See text for identification of the variables.



With 4, 60 d.f., the obtained F ratio of 4.7 for regression was significant
at the .005 level. It should be noted, however, that other interpretations
of the r² in stepwise regression might not consider the obtained r² to be
statistically significant (Wilkinson, 1979).
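The reported fit statistics can be cross-checked with the usual textbook formulas for adjusted r² and the overall F ratio. This is a sketch, not the SPSS computation itself: it assumes the four terms of Table 5 as predictors and all 65 subjects as cases, and the function names are hypothetical.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Shrink r-squared for k predictors fitted to n cases."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2: float, n: int, k: int) -> float:
    """F ratio for the regression, with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


N_SUBJECTS, N_PREDICTORS = 65, 4   # sample size and number of terms in Table 5
print(round(adjusted_r2(0.239, N_SUBJECTS, N_PREDICTORS), 3))
print(round(overall_f(0.239, N_SUBJECTS, N_PREDICTORS), 2))
```

Both values agree with the reported .189 and 4.7 to within rounding. These formulas treat the four predictors as if they had been fixed in advance, so they take no account of the selection performed by the stepwise procedure, which is the basis of the caution cited above.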



The variables in the equation included General Information (GI); an interactive
variable, AFQT times D3, the race dummy (1 = white, 0 = nonwhite); GI
times a school dummy; and EI (Electronics Information). Only the first two
contributed to the equation at a statistically significant level. Thus, for all
subjects, GI was the best predictor of IPR, and for whites, the AFQT was also a
significant predictor. The interactive term would seem to incorporate the fact that
whites scored higher than nonwhites on both the IPR and the AFQT; the AFQT
was the best variable to scale the difference between whites and nonwhites on
the IPR.
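Written out as a function, the fitted equation of Table 5 looks as follows. The coefficients are those in the table; reading AFD3 as AFQT times D3 and GID1 as GI times a school dummy D1 follows the description above, but which school D1 marks is not stated, so it is left as a parameter. The function name and the input values are illustrative only.

```python
def predicted_ipr(gi, afqt, ei, white, school_d1):
    """Evaluate the fitted equation in Table 5.

    white     -- D3, the race dummy (1 = white, 0 = nonwhite)
    school_d1 -- D1, a school dummy (the school it marks is not stated)
    """
    return (384.96
            + 32.22 * gi                 # GI
            + 3.48 * afqt * white        # AFD3 = AFQT x D3
            - 12.49 * gi * school_d1     # GID1 = GI x D1
            - 8.67 * ei)                 # EI


# Illustration: total-sample means from Table 3, with the school dummy set to 0.
for white in (1, 0):
    print(white, round(predicted_ipr(8.63, 40.32, 14.08, white, 0), 1))
```

With GI and EI held equal, the two predictions differ only by 3.48 times the AFQT score, which is how the interactive term scales the white-nonwhite difference on the IPR.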



DISCUSSION 



Comparison with Previous Findings 

One of the objectives of the study was to compare its findings with the 
results of the original study using the discrimination learning test (Arima, 1978).
The mean IPR in the previous study for the self-paced condition was 216.5. The mean IPR
in the present study was 607.0. The possible sources of the difference are too 
many to make reliable comparisons. However, the two items that stand out are 
the automation of the present version vs. the manual advance of the earlier test 
and the correction method (contingent reinforcement) used in the present study. 

In the present study, the subject had to press the correct alternative to advance 
the system, whereas the subject in the former study was merely informed by a 
light when he or she made the correct choice by depressing the appropriate response 
buttons.

There were significant white-nonwhite differences in the machine-paced condition
of the earlier study that apparently disappeared in the self-paced mode.
In the present study, there are still significant white-nonwhite differences, but
the primary contribution to this difference comes from the female subjects, where
there was a 200-point difference favoring the white females. Since there were no
sex differences among the white subjects, and there was a 136-point difference
between male and female nonwhite subjects (Table 3), it appears that the nonwhite females were
a particularly low-performing sample. There were no females in the previous 
study and no significant difference between white and nonwhite male subjects in 
the present study. Accordingly, there is some justification for concluding that 
there are no reliable differences between white and nonwhite male subjects. More 
data would be required to make a similar statement for the female subjects. 

In the earlier study, there was a statistically significant correlation 
between IPR and the AFQT for the total sample and the white subsample. The corre- 
lation was not significant for the nonwhites. In this study, there is still a 
significant relationship between the IPR and AFQT for the total sample, but the 
significant subsample correlation now occurs in the female subsample. Nevertheless,
in view of the repetition of the significant correlation for the larger
(total) sample and the regression equation in which, as formerly, the AFQT plays
a significant role for only the white subjects, it is concluded that there is a
modest, but reliable, relationship between the learning performance and the AFQT
score. This relationship is further explored below. 

Relationships between Learning Performance 
and ASVAB Test Scores 



The relationship between learning performance and the psychometrically
derived ASVAB test scores is of particular interest to this study. There is no
doubt that a close relationship exists between the IPR scores and GI (General
Information). This is evident in the degree and pattern of correlations seen
in Table 4 and in the regression equation in Table 5. As previously stated, GI 
is a test instigated by the Army to provide a "bottom" to the ASVAB. The Army 
needed a test to differentiate the potential usefulness of individuals who 
score low on the basic tests used for screening enlistees. In the present study, 
the highest correlations between IPR and GI scores occurred for those subsamples 
scoring lower on the AFQT — nonwhites and females. For subjects scoring higher 
on the AFQT, it may be that ceiling effects in both variables attenuated the cal- 
culated relationship (correlation) between them. 

To explore the IPR-GI relationship further, the nature of GI, itself, should 
be examined. First, Table 3 shows that there is no white-nonwhite difference
in the GI scores. This comparison holds up very well when the comparison is 
made between white and nonwhite males and white and nonwhite females. There 
appears to be a large difference in GI between males and females, just as there 
is in the AFQT scores. It is remarkable — considering that the other tests of 
the ASVAB are longer, more reliable, and generally recognized to be the "heavy- 
weights" in evaluating individuals — that the 15-item GI test should stand out 
as the best predictor of learning performance. The relationship of GI to the 
other ASVAB tests is shown in Table 6. The table reveals that GI is signifi- 
cantly correlated with every other subtest in the battery, and it is also one 
of three tests that identify the first factor (Verbal) extracted from the test 
correlations (U.S. Military Enlisted Processing Command, undated). The factor was
identified as the ability to tie words and information together. The foregoing
would seem to justify the contention that GI is a measure of a strong general 
factor that pervades and dominates the ASVAB tests and especially the compos- 
ites (Cronbach, 1979). 



Table 6

CORRELATION BETWEEN GENERAL INFORMATION AND OTHER ASVAB SUBTESTS*

 NO    AD    WK    AR    SP    MK    EI    MC    GS    SI    AI

 44    27    61    52    34    52    61    57    59    61    57
 28    14    52    47    34    43    53    51    49    50    47

* Based on Service standardization sample (upper row) and sample of 2,052 students
in the 10th, 11th and 12th grades (bottom row).

As for the IPR score, it has only been identified as a rote learning score. 
It is not a perceptual or speed test, as evidenced by the zero or near-zero 
correlations between it and the Attention to Detail and Numerical Operations 
subtests. It is also not dependent on spatial perception as demonstrated by a 
relatively low correlation with the SP test in Table 4. It is correlated, for 
the general sample, with Word Knowledge, Mechanical Comprehension, and the AFQT. 
On the basis of the differential test results, the IPR score is apparently the
result of coding (labeling), organizing, and storing in short-term memory, for
immediate retrieval, discriminating information about the nonsense-form stimulus
pairs. Jensen (1979) states that this sort of a task makes moderate demands on
the concept he calls g, a general measure of mental ability or intelligence. 

From the preceding analysis of the characteristics of the GI test and the 
IPR measure, it is hypothesized that they are both measuring a general capacity 
for processing and using information and a general characteristic of alertness 
and responsivity to the environment. One would conjecture that either measure 
would be related to the latency of the alerting response as measured in recent 
studies using averaged brain potential responses to a light stimulus. These 
concepts require experimental verification, of course. 
As a performance measure, the learning task could be improved and made
more discriminating of individual differences if an individually determined 
stopping criterion were used. For example, 12 successive, correct responses 
might be such a criterion. The intent was to investigate this possibility, 
but the instrumentation proved to be uncooperative. A fixed number of trials, 
as well as paced presentations, penalizes the rapid learner. The information 
processing rate should be calculated for the learning period and not attenuated 
by the time required for reflexive responding once the material has been
learned.
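The proposed scoring change can be sketched as follows. This is a hypothetical illustration of the idea, not a procedure that was run in the study: it assumes each frame is logged with whether the first response was correct and with its viewing time in seconds, and it stops accumulating at the frame that completes a run of 12 consecutive correct responses. The function name and the record layout are assumptions.

```python
def ipr_to_criterion(frames, run_length=12, scale=1000.0):
    """Compute the IPR over the learning period only.

    frames     -- sequence of (first_response_correct, viewing_time_sec)
    run_length -- consecutive correct responses that end the test
    Returns (ipr, frames_used); every frame is used if the run never occurs.
    """
    n_correct = 0
    viewing_time = 0.0
    run = 0
    used = 0
    for correct, seconds in frames:
        used += 1
        viewing_time += seconds
        if correct:
            n_correct += 1
            run += 1
        else:
            run = 0   # an error restarts the count of consecutive successes
        if run == run_length:
            break
    return scale * n_correct / viewing_time, used


# Hypothetical record: 6 slow errors, then 54 quick correct responses.
record = [(False, 2.0)] * 6 + [(True, 1.0)] * 54
print(ipr_to_criterion(record))
```

Subjects who reach the criterion early would be scored on fewer frames, so frames answered reflexively after learning no longer enter the rate.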

Implications for Personnel Selection 



If the IPR were scored with an individual stopping criterion in order to 
increase the variance in performance among individuals, it would seem to be an 
effective and efficient measure of the general intelligence of a person that is 
reasonably culture free. While it apparently measures the same area as a gen- 
eral factor that dominates the ASVAB, it provides the opportunity for those 
with poorer language skills to show their capabilities in the areas of the 
highly language-dominated tests of the ASVAB. With the advent of computerized 
testing, this and similar performance tests should be simple and efficient to 
administer and could provide a greater pool of individuals for selection. More- 
over, there has been little validation of the selection instruments with per- 
formance in the Armed Services because positive correlations are typically not 
found. It could be that simple performance tests used as selectors might pro- 
vide the dimensions to better the validation of selection tests. In the area 
of truth in testing, the performance tests would have a great advantage since 
the correct answers could be tailored for each subject at the time of testing 
if the tasks were designed to permit this option. For example, in the present 
discrimination learning test, the correct member of each pair could be randomly 
determined immediately prior to testing. 

If the ASVAB composites are so dominated by the general factor as to make
them essentially useless for counseling, as asserted by Cronbach (1979), the same
could be said for their use in placement, as employed by the Armed Services.
Reliable differences must exist between the composites to make either function
possible. Unfortunately, the correlations among the key technical Navy composites
range from .88 to .91. Swanson (1978) provides validation data for end-of-course
grades or time-to-completion of self-paced courses for 19 schools using
the General Technical composite and 8 schools using the Mechanical composite.
In almost all of the cases, the correlations are higher for the Electronics
composite. The Electronics composite holds up well as the selector with the
highest correlation for the 9 schools using it as a selector. Judging from
these limited examples, it would be more efficient just to use the Electronics
composite as the selector for all of the schools shown in Swanson's study.

This study has served to reinforce the notion of a general factor dominating 
the ASVAB tests by calling attention to the pervasive relationship of General 
Information to all of the tests and the fact that the General Information test 
best predicts scores on a discrimination-learning, performance test. 

Finally, attention should be called to the case of the females in this 
study. They are typical of standardization populations in general for the 
ASVAB (Jensen, 1977) in that their AFQT scores are one-half a standard deviation
lower than those of the males, and they do poorly in the trade tests. If the
standardization norms are strictly applied, the females are very adversely affected in
selection for service or the more desirable technical courses. They maintain
equity only in the areas of Attention to Detail and Numerical Operations that
are the key elements of the Clerical composite. It should be noted again that
the mean IPR scores of the white males and females were nearly identical, indicating
that they were comparable in general cognitive ability.